Lenience towards Teammates Helps in Cooperative Multiagent Learning
Authors
Abstract
Concurrent learning is a form of cooperative multiagent learning in which each agent has an independent learning process and little or no control over its teammates’ actions. In such algorithms, an agent’s perception of the joint search space depends on the rewards it receives, which in turn depend on the actions currently chosen by the other agents. Because of their learning processes, the agents tend to converge towards certain areas of the joint space over time. As a result, an agent’s perception of the search space benefits from being computed over multiple rewards at early stages of learning, whereas additional rewards have little impact towards the end. We thus suggest that agents should be lenient with their teammates: ignore many of the low rewards initially, and ignore fewer of them as learning progresses. We demonstrate the benefit of lenience in a cooperative coevolution algorithm and in a new reinforcement learning algorithm.
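To make the lenience idea concrete, here is a minimal sketch in Python of a lenient stateless learner for a two-agent cooperative matrix game. The per-action temperature, the 1 − exp(−κT) acceptance rule, and all class names and constants are illustrative assumptions in the spirit of the abstract, not the paper’s exact algorithms; the climbing-game payoffs are a standard benchmark for this setting, though the paper may use other test beds.

```python
import math
import random

class LenientLearner:
    """Stateless learner for a repeated cooperative matrix game.

    Each action keeps its own temperature. While the temperature is
    high, updates that would lower the action's utility estimate are
    usually ignored (the agent is lenient); as the temperature decays,
    low rewards are ignored less and less often.
    """

    def __init__(self, n_actions, alpha=0.1, kappa=1.0,
                 init_temp=50.0, decay=0.995, epsilon=0.1):
        self.q = [0.0] * n_actions           # utility estimate per action
        self.temp = [init_temp] * n_actions  # per-action temperature
        self.alpha = alpha                   # learning rate
        self.kappa = kappa                   # leniency steepness
        self.decay = decay                   # temperature decay factor
        self.epsilon = epsilon               # exploration rate

    def choose(self):
        # Plain epsilon-greedy action selection (illustrative only).
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        # Leniency in [0, 1): near 1 while the temperature is high.
        leniency = 1.0 - math.exp(-self.kappa * self.temp[action])
        # Rewards that raise the estimate are always accepted; lower
        # rewards are accepted only with probability 1 - leniency.
        if reward >= self.q[action] or random.random() > leniency:
            self.q[action] += self.alpha * (reward - self.q[action])
        self.temp[action] *= self.decay  # become less lenient over time


# The fully cooperative "climbing game": both agents receive the same
# joint reward, and miscoordination around the optimum (11) is punished.
PAYOFF = [[ 11, -30,  0],
          [-30,   7,  6],
          [  0,   0,  5]]

agent1, agent2 = LenientLearner(3), LenientLearner(3)
for _ in range(20000):
    i, j = agent1.choose(), agent2.choose()
    reward = PAYOFF[i][j]
    agent1.update(i, reward)
    agent2.update(j, reward)

print(agent1.q, agent2.q)
```

With this schedule, both sketched agents usually converge on the optimal joint action (0, 0): early on they retain only the high reward of 11 and ignore the −30 miscoordination penalties, whereas a learner that averaged every reward would tend to settle on the safer but suboptimal actions.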
Similar resources
Autonomous Learning Agents: Layered Learning and Ad Hoc Teamwork
In order to achieve long-term autonomy in the real world, fully autonomous agents need to be able to learn, both to improve their behaviors in a complex, dynamically changing world, and to enable interaction with previously unfamiliar agents. This talk begins by presenting layered learning, a hierarchical machine learning paradigm that enables learning of complex behaviors by incrementally lear...
Grounded Semantic Networks for Learning Shared Communication Protocols
Cooperative multiagent learning poses the challenge of coordinating independent agents. A powerful method to achieve coordination is allowing agents to communicate. We present the Grounded Semantic Network, an approach for learning a task-dependent communication protocol grounded in the observation space and reward function of the task. We show that the Grounded Semantic Network effectively lea...
Efficient Behavior Learning Based on State Value Estimation of Self and Others
Existing reinforcement learning methods suffer seriously from the curse of dimensionality, especially when applied to multiagent dynamic environments. A typical example is RoboCup competitions, since other agents and their behavior easily cause state and action space explosion. This paper presents a method of modular learning in a multiagent enviro...
Efficient Behavior Learning by Utilizing Estimated State Value of Self and Teammates
Reinforcement learning applications to real robots in multiagent dynamic environments are limited by the huge exploration space and enormously long learning times. A typical example is RoboCup competitions, since other agents and their behavior easily cause state and action space explosion. This paper presents a method that utilizes state value functions of macro actions t...
Adapting Plans through Communication with Unknown Teammates (Doctoral Consortium)
Coordinating a team of autonomous agents is a challenging problem. Agents must act in ways that make progress toward the goal while avoiding conflict with their teammates. In information-asymmetric domains, it is often necessary to share crucial observations in order to collaborate effectively. In the traditional multiagent systems literature, these teams of agents share an ...
Publication date: 2005